Cohort Building Using the MIDRC Data Commons


This notebook briefly demonstrates how to use the MIDRC open APIs to build a cohort of MIDRC imaging studies from patient clinical data and AI-research-based annotations in the MIDRC data commons. It then shows how to access and view the X-ray image files associated with those imaging studies.

All cohort selection possible in the MIDRC data explorer UI can also be achieved programmatically using API requests. In this notebook, we'll select the same cohort as in the data explorer demo detailed in these slides.

by Chris Meyer, PhD

Manager of Data and User Services at the Center for Translational Data Science at University of Chicago

Presented at the MIDRC RSNA 2023 Deep Learning Lab on November 28, 2023

1) Set up Python environment


Download an API key file containing your credentials


  1. Navigate to the MIDRC data portal in your browser: https://data.midrc.org.
  2. Read and accept the DUA (if you haven't already).
  3. Navigate to the user profile page: https://data.midrc.org/identity
  4. Click the "Create API Key" button and save the credentials.json file somewhere safe.

Set local variables


Change the path in the cred variable below so that it points to the credentials file you downloaded from the MIDRC data portal by following the instructions above.

In [1]:
cred = "/Users/christopher/Downloads/midrc-credentials.json" # location of your MIDRC credentials, downloaded from https://data.midrc.org/identity by clicking "Create API key" button and saving the credentials.json locally
api = "https://data.midrc.org" # The base URL of the data commons being queried. This shouldn't change for MIDRC.
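Before going further, it can help to confirm that the path in cred points to a readable JSON file. The following is a minimal sketch, not part of the Gen3 SDK: the helper name check_credentials_file is hypothetical, and it assumes the credentials file is a JSON object with "api_key" and "key_id" entries.

```python
import json
from pathlib import Path


def check_credentials_file(path):
    """Return a list of problems found with a credentials file (empty if none).

    Assumption: the file is a JSON object with "api_key" and "key_id" entries.
    """
    file_path = Path(path).expanduser()
    if not file_path.is_file():
        return ["file not found: {}".format(file_path)]
    try:
        content = json.loads(file_path.read_text())
    except ValueError as error:
        return ["not valid JSON: {}".format(error)]
    if not isinstance(content, dict):
        return ["expected a JSON object at the top level"]
    # Report each expected entry that is absent from the file
    return [
        "missing key: {}".format(key)
        for key in ("api_key", "key_id")
        if key not in content
    ]
```

Calling check_credentials_file(cred) should return an empty list when the file looks usable; any returned messages point at what to fix before authenticating.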

Install / Import Python Packages and Scripts

In [2]:
## Uncomment any of the lines below to install packages that are missing from your environment; they are needed by the imports in the following cells.

import sys
#!{sys.executable} -m pip install --upgrade pandas
#!{sys.executable} -m pip install --upgrade --ignore-installed PyYAML
#!{sys.executable} -m pip install --upgrade pip
#!{sys.executable} -m pip install --upgrade gen3
#!{sys.executable} -m pip install pydicom
#!{sys.executable} -m pip install --upgrade Pillow
#!{sys.executable} -m pip install psmpy
#!{sys.executable} -m pip install python-gdcm --upgrade
#!{sys.executable} -m pip install pylibjpeg --upgrade
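To decide which of the install commands above are actually needed, one option is to check which modules can be found before importing them. This is a small sketch using only the standard library; the helper name missing_packages is hypothetical, and the list contains the import names used in this notebook (Pillow is imported as "PIL").

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Import names used in this notebook (note: Pillow is imported as "PIL")
required = ["pandas", "numpy", "pydicom", "PIL", "gen3"]
print("Missing packages:", missing_packages(required))
```

An empty list means every module was found; otherwise, uncomment the matching pip install lines and re-run the cell.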
In [3]:
## Import Python Packages and scripts

import os, subprocess
import pandas as pd
import numpy as np
import pydicom
from PIL import Image
import glob
#import gdcm
#import pylibjpeg

# import some Gen3 packages
import gen3
from gen3.auth import Gen3Auth
from gen3.query import Gen3Query
/Users/christopher/Library/Python/3.9/lib/python/site-packages/urllib3/__init__.py:34: NotOpenSSLWarning: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(

Initiate instances of the Gen3 SDK Classes using credentials file for authentication


Again, make sure the cred variable set above points to the location of your credentials file.

In [5]:
auth = Gen3Auth(api, refresh_file=cred) # authentication class
query = Gen3Query(auth) # query class

2) Build Cohorts by Sending Queries to the MIDRC APIs

General notes on sending queries:

  • There are many ways to query and access metadata for cohort building in MIDRC, but this notebook focuses on the Gen3 GraphQL query service "guppy". This is the backend query service that MIDRC's data explorer GUI uses, so anything you can do in the explorer GUI you can also do with guppy queries, and more!
  • The guppy GraphQL service has more functionality than is demonstrated in this simple example. You can find extensive documentation in GitHub here in case you'd like to build your own queries from scratch.
  • The Gen3 SDK (initialized as query above in this notebook) has Python wrapper scripts to make sending queries to the guppy GraphQL API simpler. The guppy SDK package can be viewed in GitHub here.
  • Guppy queries focus on a particular type of data (cases, imaging studies, files, etc.), which corresponds to the major tabs in MIDRC's data explorer GUI.
  • Queries include arguments that are akin to selecting filter values in MIDRC's data explorer GUI.
  • To see more documentation about how to use and combine filters with various operator logic (like AND/OR/IN, etc.) see this page.
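The filter logic described in the notes above can be sketched as plain Python dictionaries. The helper names compare, all_of, and any_of below are hypothetical conveniences and not part of the Gen3 SDK; the operators and field names are the ones mentioned or used in this notebook, and the particular combination is only an illustration.

```python
def compare(operator, field, value):
    """Build a single-field filter such as {">=": {"age_at_index": 70}}."""
    return {operator: {field: value}}


def all_of(*filters):
    """Combine filters so that every one must match (logical AND)."""
    return {"AND": list(filters)}


def any_of(*filters):
    """Combine filters so that at least one must match (logical OR)."""
    return {"OR": list(filters)}


# Illustration only: male cases with a DX or CR study who are either
# 70 and older or 18 and younger
example_filter = all_of(
    compare("=", "sex", "Male"),
    compare("IN", "study_modality", ["DX", "CR"]),
    any_of(
        compare(">=", "age_at_index", 70),
        compare("<=", "age_at_index", 18),
    ),
)
print(example_filter)
```

A dictionary built this way has the same shape as a hand-written filter_object, so the helpers only save typing; they do not add any query capability.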

Set query parameters


  • Here, we'll send a query to the imaging_study guppy index, which corresponds to the "Imaging Studies" tab of MIDRC's data explorer GUI.
  • The filters defined below can be modified to return different subsets of imaging studies. Here, we'll use rather restrictive parameters so the number of studies returned is small for demonstration purposes.
  • If our query request is successful, the API response should be in JSON format, and it should contain a list of imaging study records along with the fields we asked for.
In [7]:
### Set some "imaging_study" query parameters

## mRALE filter: we'll select all imaging studies annotated with an mRALE score greater than or equal to this threshold number
mRALE_threshold = 20

## days from study to positive COVID-19 test filter: we want imaging studies performed within two days after a positive test
min_days_from_study_to_test = -2
max_days_from_study_to_test = 0

## Imaging study modality filter: we select imaging studies with a modality of either DX or CR
study_modalities = ["DX", "CR"]

## Imaging study body part filter: here we select "chest" as the "LOINC system" filter, which is the body part examined
body_part_examined = "Chest"

## Case filters: we will select Hispanic males 70 years of age and older
ethnicity = "Hispanic or Latino"
sex = "Male"
age_threshold = 70
In [8]:
## Note: the "fields" option defines what fields we want the query to return. If set to "None", returns all available fields.

imaging_studies = query.raw_data_download(
                    data_type="imaging_study",
                    fields=None,
                    filter_object={
                        "AND": [
                            {"=": {"loinc_system": body_part_examined}},
                            {"=": {"sex": sex}},
                            {"=": {"ethnicity": ethnicity}},
                            {">=": {"age_at_index": age_threshold}},
                            {"IN": {"study_modality": study_modalities}},
                            {"nested": {"path": "imaging_study_annotations", ">=": {"midrc_mRALE_score": mRALE_threshold}}},
                            {"AND": [
                                {">=": {"days_from_study_to_pos_covid_test": min_days_from_study_to_test}}, 
                                {"<=": {"days_from_study_to_pos_covid_test": max_days_from_study_to_test}}                                
                            ]}
                        ]
                    },
                    sort_fields=[{"submitter_id": "asc"}]
                )

if len(imaging_studies) > 0 and "submitter_id" in imaging_studies[0]:
    imaging_studies_ids = [i['submitter_id'] for i in imaging_studies] ## make a list of the imaging study IDs returned
    print("Query returned {} study IDs.".format(len(imaging_studies)))
    print("Data is a list with rows like this:\n\t {}".format(imaging_studies[0:1]))
else:
    print("Your query returned no data! Please, check that query parameters are valid.")
Query returned 9 study IDs.
Data is a list with rows like this:
	 [{'_imaging_study_id': '5f7b22b2-4566-40e8-bc85-5f9ae79e9181', 'project_id': 'Open-A1', 'submitter_id': '2.16.840.1.114274.1818.514395397152296418914049330214008864917', 'case_ids': ['10008204-RwVMPdTu0EOZV6oE7Rml5Q'], 'age_at_imaging': 71, 'body_part_examined': ['PORT CHEST'], 'days_from_study_to_pos_covid_test': [28, 0], 'days_to_study': 0, 'loinc_code': '36589-0', 'loinc_long_common_name': 'Portable XR Chest AP single view', 'loinc_method': 'XR.portable', 'loinc_system': 'Chest', 'study_description': 'CHEST PORT 1 VIEW (RAD)-CS', 'study_modality': ['CR'], 'study_year_shifted': 'true', 'study_uid': '2.16.840.1.114274.1818.514395397152296418914049330214008864917', 'sex': ['Male'], 'race': ['White'], 'age_at_index': [71], 'index_event': ['First COVID test'], 'zip': ['772'], 'covid19_positive': ['Yes'], 'ethnicity': ['Hispanic or Latino'], 'dataset_submitter_id': ['ACR_20220415', 'ACR_20220218'], 'mr_series_file': 1, 'cr_series_file': 1, 'dx_series_file': 1, 'ct_series_file': 1, 'object_id': ['dg.MD1R/939b1509-5e00-485a-9b81-541e994ee77a', 'dg.MD1R/d331258b-2a2d-45a2-bd9f-d4840fab4928', 'dg.MD1R/197a2af1-1958-4e85-8fb4-37a346bcb150'], 'data_format': ['CSV', 'DCM'], 'data_type': ['MIDRC Annotation', 'DICOM'], 'data_category': ['DICOM Annotation Series File', 'CR', 'annotation_file'], 'imaging_study_annotations': [{'annotation_method': 'Retrospective_auto', 'annotator_id': 'SIFT', 'instance_uids': ['2.16.840.1.114274.1818.57232156540098663905951504146530613421'], '_annotation_id': '7349a919-2092-46a8-9452-eb8960898c40'}, {'midrc_mRALE_score': 24, '_annotation_id': '463b7844-7616-4786-a8f0-f10161bc6ea0'}], 'auth_resource_path': '/programs/Open/projects/A1'}]
In [9]:
imaging_studies_df = pd.DataFrame(imaging_studies)
display(imaging_studies_df)
_imaging_study_id project_id submitter_id case_ids age_at_imaging body_part_examined days_from_study_to_pos_covid_test days_to_study loinc_code loinc_long_common_name ... cr_series_file dx_series_file ct_series_file object_id data_format data_type data_category imaging_study_annotations auth_resource_path days_from_study_to_neg_covid_test
0 5f7b22b2-4566-40e8-bc85-5f9ae79e9181 Open-A1 2.16.840.1.114274.1818.51439539715229641891404... [10008204-RwVMPdTu0EOZV6oE7Rml5Q] 71 [PORT CHEST] [28, 0] 0 36589-0 Portable XR Chest AP single view ... 1 1 1 [dg.MD1R/939b1509-5e00-485a-9b81-541e994ee77a,... [CSV, DCM] [MIDRC Annotation, DICOM] [DICOM Annotation Series File, CR, annotation_... [{'annotation_method': 'Retrospective_auto', '... /programs/Open/projects/A1 NaN
1 15f6d9c7-5c1a-41b3-b230-7b678c3d5bcf Open-R1 1.2.826.0.1.3680043.10.474.419639.210778054999... [419639-003484] 72 [CHEST] [21, 0] 0 36572-6 XR Chest AP ... 1 1 1 [dg.MD1R/1e5c81f3-379b-4f4a-ab3e-f220e21c3d03,... [CSV, DCM] [MIDRC Annotation, DICOM] [DX, DICOM Annotation Series File, annotation_... [{'annotation_method': 'Retrospective_auto', '... /programs/Open/projects/R1 [33, 32]
2 a960e1bc-c81f-4416-ada7-1894eb154931 Open-R1 1.2.826.0.1.3680043.10.474.419639.136347983342... [419639-005966] 72 [CHEST] [1, -2] 2 36572-6 XR Chest AP ... 1 1 1 [dg.MD1R/7ac2682b-db73-44a8-89b8-d12bd7d48adc,... [CSV, DCM] [MIDRC Annotation, DICOM] [DICOM Annotation Series File, CR, annotation_... [{'midrc_mRALE_score': 24, '_annotation_id': '... /programs/Open/projects/R1 NaN
3 8c1fca33-09b8-486e-9497-7c405011d528 Open-R1 1.2.826.0.1.3680043.10.474.419639.125810573309... [419639-004486] 70 [CHEST] [14, 7, -1, -26] 26 36572-6 XR Chest AP ... 1 2 1 [dg.MD1R/99d20215-0ead-49eb-9d86-8e67eaa19217,... [CSV, DCM] [MIDRC Annotation, DICOM] [DX, DICOM Annotation Series File, annotation_... [{'annotation_method': 'Retrospective_auto', '... /programs/Open/projects/R1 [32, 31, -122, -147]
4 3a38fd7a-9375-4b16-82a3-6847a7e00754 Open-R1 1.2.826.0.1.3680043.10.474.419639.161896168092... [419639-004486] 70 [CHEST] [15, 8, 0, -25] 25 36572-6 XR Chest AP ... 1 1 1 [dg.MD1R/6bda2df1-6972-4c1b-849a-823cdddea86d,... [CSV, DCM] [MIDRC Annotation, DICOM] [DX, DICOM Annotation Series File, annotation_... [{'annotation_method': 'Retrospective_auto', '... /programs/Open/projects/R1 [33, 32, -121, -146]
5 f73ba2b6-0452-4a6a-9c1a-fb1aaca2a311 Open-R1 1.2.826.0.1.3680043.10.474.419639.140316142430... [419639-003484] 72 [CHEST] [21, 0] 0 36572-6 XR Chest AP ... 1 1 1 [dg.MD1R/8a5a00f6-4b92-4405-a1f5-a66ca756d81c,... [CSV, DCM] [MIDRC Annotation, DICOM] [DX, DICOM Annotation Series File, annotation_... [{'midrc_mRALE_score': 20, '_annotation_id': '... /programs/Open/projects/R1 [33, 32]
6 b9b5a9ee-fe52-4e42-bd4e-15c46aa7ef7e Open-R1 1.2.826.0.1.3680043.10.474.419639.118613874522... [419639-004486] 70 [CHEST] [7, 0, -8, -33] 33 36572-6 XR Chest AP ... 1 1 1 [dg.MD1R/1258c1a9-e3ea-4847-bfb4-7ea7bc1aa77e,... [CSV, DCM] [MIDRC Annotation, DICOM] [DX, DICOM Annotation Series File, annotation_... [{'annotation_method': 'Retrospective_auto', '... /programs/Open/projects/R1 [25, 24, -129, -154]
7 f7285515-be07-43bd-bc66-dab8ba6a89f2 Open-R1 1.2.826.0.1.3680043.10.474.419639.238877950213... [419639-005966] 72 [CHEST] [3, 0] 0 36572-6 XR Chest AP ... 1 1 1 [dg.MD1R/0320d6e3-9e02-472c-8194-3e530a2b5e9e,... [CSV, DCM] [MIDRC Annotation, DICOM] [DX, DICOM Annotation Series File, annotation_... [{'annotation_method': 'Retrospective_auto', '... /programs/Open/projects/R1 NaN
8 f7d154dc-1e96-47f5-bb91-8fe668350def Open-R1 1.2.826.0.1.3680043.10.474.419639.274623828911... [419639-004486] 70 [CHEST] [40, 33, 25, 0] 0 36572-6 XR Chest AP ... 1 1 1 [dg.MD1R/1bb654e6-8a1e-49c8-a88a-329ffbe75c35,... [CSV, DCM] [MIDRC Annotation, DICOM] [DX, DICOM Annotation Series File, annotation_... [{'midrc_mRALE_score': 21, '_annotation_id': '... /programs/Open/projects/R1 [58, 57, -96, -121]

9 rows × 35 columns
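In the records returned above, imaging_study_annotations is a list of annotation records and days_from_study_to_pos_covid_test is a list of values, so it can be worth re-checking the cohort criteria on the client side. The sketch below is a cautious sanity check rather than a description of how the query service evaluates list-valued fields; the helper names are hypothetical, and the sample record is abbreviated from the first row printed above.

```python
def max_mrale_score(study):
    """Return the highest mRALE score among a study's annotations, or None."""
    scores = [
        annotation["midrc_mRALE_score"]
        for annotation in study.get("imaging_study_annotations") or []
        if annotation.get("midrc_mRALE_score") is not None
    ]
    return max(scores) if scores else None


def has_test_in_window(study, min_days, max_days):
    """True if any single value lies within [min_days, max_days]."""
    days = study.get("days_from_study_to_pos_covid_test") or []
    # int() also accepts values that arrive as strings such as "0"
    return any(min_days <= int(day) <= max_days for day in days)


# Abbreviated copy of the first imaging study record shown above
sample_study = {
    "days_from_study_to_pos_covid_test": [28, 0],
    "imaging_study_annotations": [
        {"annotation_method": "Retrospective_auto", "annotator_id": "SIFT"},
        {"midrc_mRALE_score": 24},
    ],
}
print(max_mrale_score(sample_study), has_test_in_window(sample_study, -2, 0))
```

Applied to the full result, for example with [max_mrale_score(s) for s in imaging_studies], the scores can be added as a flat column to imaging_studies_df.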

3) Send another query to get data file details for our cohort / case ID


The object_id field in each imaging study record above contains the file identifiers for all files associated with each imaging study, which could include files like third-party annotations. If we simply want to access all files associated with our list of cases, we can use those object_ids.

However, in this example, we'll ask for specific types of files and get more detailed information about each of the files. This is achieved by querying the data_file guppy index, which corresponds to the "Data Files" tab of the MIDRC data explorer GUI.

All MIDRC data files, including both images and annotations, are listed in the guppy index "data_file", which is queried in a similar manner to our query of the imaging_study index above. The query parameter data_type below determines which guppy (Elasticsearch) index we're querying.

To get only data_file records that correspond to our imaging study cohort built previously, we'll use the list of study UIDs as a query filter.

Set 'data_file' query parameters


Here, we'll utilize the property source_node to filter the list of files for our cohort to only those matching the type of files we're interested in. In this example, we ask only for CR and DX (x-ray) images, which will exclude any other types of files like annotations.

We're also using the property study_uid as a filter to restrict the data_file records returned down to those associated with the imaging studies in our cohort built above.
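For a larger cohort, the list of study UIDs passed to the study_uid filter can become long. One possible approach, sketched below, is to send one data_file query per batch of UIDs and combine the results. The helper names and the batch size of 500 are hypothetical illustrations, not documented limits of the service; the raw_data_download call uses the same arguments as the query in this section.

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def download_data_files(query, study_uids, source_nodes, batch_size=500):
    """Run one data_file query per batch of study UIDs and combine the results.

    batch_size is an arbitrary illustrative value, not a documented limit.
    """
    records = []
    for batch in chunked(study_uids, batch_size):
        records.extend(
            query.raw_data_download(
                data_type="data_file",
                fields=None,
                filter_object={
                    "AND": [
                        {"IN": {"study_uid": batch}},
                        {"IN": {"source_node": source_nodes}},
                    ]
                },
                sort_fields=[{"submitter_id": "asc"}],
            )
        )
    return records
```

Note that sorting applies within each batch only, so the combined list is not globally sorted by submitter_id. For the nine studies in this demo a single query is sufficient.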

In [10]:
# Build a list of study UIDs to use as a filter in our data_file query
study_uids = [i['study_uid'] for i in imaging_studies]
study_uids
Out[10]:
['2.16.840.1.114274.1818.514395397152296418914049330214008864917',
 '1.2.826.0.1.3680043.10.474.419639.210778054999760359188417916669',
 '1.2.826.0.1.3680043.10.474.419639.136347983342456090026818822110',
 '1.2.826.0.1.3680043.10.474.419639.125810573309217143784890529197',
 '1.2.826.0.1.3680043.10.474.419639.161896168092474395209530500374',
 '1.2.826.0.1.3680043.10.474.419639.140316142430659476069988221925',
 '1.2.826.0.1.3680043.10.474.419639.118613874522385825655419409466',
 '1.2.826.0.1.3680043.10.474.419639.238877950213975218363744768567',
 '1.2.826.0.1.3680043.10.474.419639.274623828911318186126754873274']
In [11]:
# Choose the types of data we want using "source_node" as a filter
source_nodes = ["cr_series_file","dx_series_file"]
In [12]:
## Search for specific files associated with our cohort by adding "study_uid" as a filter
# * Note: "fields" is set to "None" in this query, which by default returns all the properties available
data_files = query.raw_data_download(
                    data_type="data_file",
                    fields=None,
                    filter_object={
                        "AND": [
                            {"IN": {"study_uid": study_uids}},
                            {"IN": {"source_node": source_nodes}},
                        ]
                    },
                    sort_fields=[{"submitter_id": "asc"}]
                )

if len(data_files) > 0:
    object_ids = [i['object_id'] for i in data_files if 'object_id' in i] ## make a list of the file object_ids returned by our query
    print("Query returned {} data files with {} object_ids.".format(len(data_files),len(object_ids)))
    print("Data is a list with rows like this:\n\t {}".format(data_files[0:1]))
else:
    print("Your query returned no data! Please, check that query parameters are valid.")
Query returned 10 data files with 10 object_ids.
Data is a list with rows like this:
	 [{'_data_file_id': 'a0959113-c951-4483-b1e3-08b553df3e3a', 'project_id': 'Open-A1', 'submitter_id': '2.16.840.1.114274.1818.49815354666685421105401695275387637902', 'series_uid': '2.16.840.1.114274.1818.49815354666685421105401695275387637902', 'case_ids': ['10008204-RwVMPdTu0EOZV6oE7Rml5Q'], 'object_id': 'dg.MD1R/197a2af1-1958-4e85-8fb4-37a346bcb150', 'md5sum': 'b17fe21f7fc34ba33f40c845bb47e0d2', 'file_name': '10008204-RwVMPdTu0EOZV6oE7Rml5Q/2.16.840.1.114274.1818.514395397152296418914049330214008864917/2.16.840.1.114274.1818.49815354666685421105401695275387637902.zip', 'file_size': 5370432, 'data_format': 'DCM', 'data_type': 'DICOM', 'data_category': 'CR', 'lossy_image_compression': '00', 'manufacturer': 'CARESTREAM HEALTH', 'manufacturer_model_name': 'DRX-REVOLUTION', 'modality': 'CR', 'series_description': 'AP(shutter)', 'source_node': 'cr_series_file', 'image_type': ['DERIVEDPRIMARY'], 'imager_pixel_spacing': [0.139, 0.139], 'pixel_spacing': [0.139, 0.139], 'view_position': ['AP'], 'program_name': ['Open'], 'project_code': ['A1'], '_dataset_id': ['da3a84c5-aa32-45cc-ab67-af674eb3d425', '4797be61-6666-45d6-9771-913ab5bd0163'], '_case_id': ['08908c2a-6222-4130-a3a9-f3e10577e3b5'], 'age_at_index': [71], 'covid19_positive': ['Yes'], 'ethnicity': ['Hispanic or Latino'], 'index_event': ['First COVID test'], 'race': ['White'], 'sex': ['Male'], 'zip': ['772'], '_imaging_study_id': ['5f7b22b2-4566-40e8-bc85-5f9ae79e9181'], 'age_at_imaging': [71], 'body_part_examined': ['PORT CHEST'], 'days_from_study_to_neg_covid_test': [], 'days_from_study_to_pos_covid_test': ['0', '28'], 'days_to_study': [0], 'study_description': ['CHEST PORT 1 VIEW (RAD)-CS'], 'study_modality': ['CR'], 'study_location': [], 'study_year': [], 'study_year_shifted': ['true'], 'study_uid': ['2.16.840.1.114274.1818.514395397152296418914049330214008864917'], 'loinc_code': ['36589-0'], 'loinc_contrast': [], 'loinc_long_common_name': ['Portable XR Chest AP single view'], 'loinc_method': ['XR.portable'], 
'loinc_system': ['Chest'], '_annotation_id': [], 'auth_resource_path': '/programs/Open/projects/A1'}]
In [13]:
# object_id (AKA "data GUID") is a globally unique file identifier that points to an actual file object in cloud storage. We'll use the object_ids along with the gen3 command-line tool to download the files these object_ids point to.
object_ids
Out[13]:
['dg.MD1R/197a2af1-1958-4e85-8fb4-37a346bcb150',
 'dg.MD1R/d5e1b796-bc72-4f83-b44a-66323c2f0a3a',
 'dg.MD1R/a880b310-dfeb-421f-b645-f4e4b86dd66b',
 'dg.MD1R/99d20215-0ead-49eb-9d86-8e67eaa19217',
 'dg.MD1R/760101f2-b2ae-43ed-b805-5d4aebb6b9f9',
 'dg.MD1R/a7278a91-6f98-4ecf-b3ac-1d3541fb760d',
 'dg.MD1R/45b5c052-e979-4c23-bffb-0e5502760690',
 'dg.MD1R/53bfd501-2719-4caa-94e0-e43c15fa7e01',
 'dg.MD1R/7ac2682b-db73-44a8-89b8-d12bd7d48adc',
 'dg.MD1R/1258c1a9-e3ea-4847-bfb4-7ea7bc1aa77e']
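Before downloading, it can be useful to estimate how much data the selected files represent. The sketch below sums the file_size field of the data_file records by modality; the helper name is hypothetical, and it assumes file_size is reported in bytes. The first sample record uses the values from the example row printed above, and the second is made up for illustration.

```python
from collections import defaultdict


def size_by_modality(files):
    """Sum the file_size of data_file records, grouped by modality."""
    totals = defaultdict(int)
    for record in files:
        totals[record.get("modality", "unknown")] += record.get("file_size") or 0
    return dict(totals)


sample_files = [
    {"modality": "CR", "file_size": 5370432},  # values from the row above
    {"modality": "DX", "file_size": 1000000},  # made-up record
]
for modality, size in size_by_modality(sample_files).items():
    print("{}: {:.1f} MB".format(modality, size / 1e6))
```

Running size_by_modality(data_files) on the real query result gives the totals for the cohort.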

4) Access data files using their object_id / data GUID (globally unique identifiers)


In order to download files stored in MIDRC, users need to reference the file's object_id (AKA data GUID or Globally Unique IDentifier).

Once we have a list of GUIDs we want to download, we can use either the gen3-client or the gen3 SDK to download the files. You can also access individual files in your browser after logging in and entering the GUID after the files/ endpoint, as in this URL: https://data.midrc.org/files/GUID

where GUID is the actual GUID, e.g.: https://data.midrc.org/files/dg.MD1R/b87d0db3-d95a-43c7-ace1-ab2c130e04ec
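The URL pattern described above can be generated for any GUID with a small helper. This is only a convenience sketch; the function name file_page_url is hypothetical.

```python
def file_page_url(guid, base_url="https://data.midrc.org"):
    """Return the browser URL for a file's GUID under the files/ endpoint."""
    return "{}/files/{}".format(base_url.rstrip("/"), guid)


print(file_page_url("dg.MD1R/b87d0db3-d95a-43c7-ace1-ab2c130e04ec"))
```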

For instructions on how to install and use the gen3-client, please see the MIDRC quick-start guide, which can be found linked here and in the MIDRC data portal header as "Get Started".

Below we use the gen3 SDK command gen3 drs-pull object which is documented in detail here.

Use the Gen3 SDK command gen3 drs-pull object to download an individual file

In [14]:
## Make a new directory for downloaded files
os.system("rm -rf downloads") # -f avoids an error message when the directory doesn't exist yet
os.system("mkdir -p downloads")
Out[14]:
0
In [15]:
## We can use a simple loop to download all files and keep track of successes and failures
import json

success,failure,other=[],[],[]
count,total = 0,len(object_ids)
for object_id in object_ids:
    count+=1
    cmd = "gen3 --auth {} --endpoint data.midrc.org drs-pull object {} --output-dir downloads".format(cred,object_id)
    stout = subprocess.run(cmd, shell=True, capture_output=True)
    print("Progress ({}/{}): {}".format(count,total,stout.stdout))
    ## The command prints a JSON summary like {"succeeded": [...], "failed": [...]}.
    ## The key "failed" is always present, so parse the JSON instead of searching for substrings.
    try:
        result = json.loads(stout.stdout)
    except ValueError:
        result = {}
    if object_id in result.get("failed", []):
        failure.append(object_id)
    elif object_id in result.get("succeeded", []):
        success.append(object_id)
    else:
        other.append(object_id)
Progress (1/10): b'{"succeeded": ["dg.MD1R/197a2af1-1958-4e85-8fb4-37a346bcb150"], "failed": []}\n'
Progress (2/10): b'{"succeeded": ["dg.MD1R/d5e1b796-bc72-4f83-b44a-66323c2f0a3a"], "failed": []}\n'
Progress (3/10): b'{"succeeded": ["dg.MD1R/a880b310-dfeb-421f-b645-f4e4b86dd66b"], "failed": []}\n'
Progress (4/10): b'{"succeeded": ["dg.MD1R/99d20215-0ead-49eb-9d86-8e67eaa19217"], "failed": []}\n'
Progress (5/10): b'{"succeeded": ["dg.MD1R/760101f2-b2ae-43ed-b805-5d4aebb6b9f9"], "failed": []}\n'
Progress (6/10): b'{"succeeded": ["dg.MD1R/a7278a91-6f98-4ecf-b3ac-1d3541fb760d"], "failed": []}\n'
Progress (7/10): b'{"succeeded": ["dg.MD1R/45b5c052-e979-4c23-bffb-0e5502760690"], "failed": []}\n'
Progress (8/10): b'{"succeeded": ["dg.MD1R/53bfd501-2719-4caa-94e0-e43c15fa7e01"], "failed": []}\n'
Progress (9/10): b'{"succeeded": ["dg.MD1R/7ac2682b-db73-44a8-89b8-d12bd7d48adc"], "failed": []}\n'
Progress (10/10): b'{"succeeded": ["dg.MD1R/1258c1a9-e3ea-4847-bfb4-7ea7bc1aa77e"], "failed": []}\n'
In [16]:
# Get a list of all downloaded .dcm files
image_files = glob.glob('downloads/**/*.dcm', recursive=True)
image_files
Out[16]:
['downloads/419639-005966/1.2.826.0.1.3680043.10.474.419639.238877950213975218363744768567/1.2.826.0.1.3680043.10.474.419639.199185633225969837235422780961/1.2.826.0.1.3680043.10.474.419639.338357498067445660994795729410.dcm',
 'downloads/419639-005966/1.2.826.0.1.3680043.10.474.419639.136347983342456090026818822110/1.2.826.0.1.3680043.10.474.419639.116290334401168239019633666124/1.2.826.0.1.3680043.10.474.419639.120300845449893409248681861328.dcm',
 'downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.161896168092474395209530500374/1.2.826.0.1.3680043.10.474.419639.241350960080622520462420009922/1.2.826.0.1.3680043.10.474.419639.145110880890384391433274781976.dcm',
 'downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.118613874522385825655419409466/1.2.826.0.1.3680043.10.474.419639.305005715360473208980309632160/1.2.826.0.1.3680043.10.474.419639.532016163989758731316920279042.dcm',
 'downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.125810573309217143784890529197/1.2.826.0.1.3680043.10.474.419639.118065679248589286894262755470/1.2.826.0.1.3680043.10.474.419639.222096163603758063408202308357.dcm',
 'downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.125810573309217143784890529197/1.2.826.0.1.3680043.10.474.419639.332211068554181067467022145158/1.2.826.0.1.3680043.10.474.419639.203380996083922564313807962791.dcm',
 'downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.274623828911318186126754873274/1.2.826.0.1.3680043.10.474.419639.251758278872580821083226828769/1.2.826.0.1.3680043.10.474.419639.129831158929213234512420327031.dcm',
 'downloads/419639-003484/1.2.826.0.1.3680043.10.474.419639.140316142430659476069988221925/1.2.826.0.1.3680043.10.474.419639.235077257085678341201661347366/1.2.826.0.1.3680043.10.474.419639.250971161552980694447122795609.dcm',
 'downloads/419639-003484/1.2.826.0.1.3680043.10.474.419639.210778054999760359188417916669/1.2.826.0.1.3680043.10.474.419639.485051506367402929274868149596/1.2.826.0.1.3680043.10.474.419639.130092207615417146244447563736.dcm',
 'downloads/10008204-RwVMPdTu0EOZV6oE7Rml5Q/2.16.840.1.114274.1818.514395397152296418914049330214008864917/2.16.840.1.114274.1818.49815354666685421105401695275387637902/2.16.840.1.114274.1818.57232156540098663905951504146530613421.dcm']
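The paths listed above appear to follow the layout downloads/case ID/study UID/series UID/instance UID.dcm. Assuming that layout holds for all downloaded files (it is inferred from the listing, not taken from documentation), a path can be split back into its identifiers, which makes it easy to link an image to the study and case records from the queries. The helper name is hypothetical.

```python
from pathlib import PurePath


def parse_download_path(path):
    """Split a downloaded file path into case, study, series and instance IDs.

    Assumes the layout <output dir>/<case>/<study>/<series>/<instance>.dcm.
    """
    parts = PurePath(path).parts
    if len(parts) < 4:
        raise ValueError("unexpected path layout: {}".format(path))
    case_id, study_uid, series_uid, file_name = parts[-4:]
    return {
        "case_id": case_id,
        "study_uid": study_uid,
        "series_uid": series_uid,
        # .stem removes only the final ".dcm" suffix, keeping the dotted UID
        "instance_uid": PurePath(file_name).stem,
    }


# Last path from the listing above
example_path = (
    "downloads/10008204-RwVMPdTu0EOZV6oE7Rml5Q/"
    "2.16.840.1.114274.1818.514395397152296418914049330214008864917/"
    "2.16.840.1.114274.1818.49815354666685421105401695275387637902/"
    "2.16.840.1.114274.1818.57232156540098663905951504146530613421.dcm"
)
print(parse_download_path(example_path))
```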

View the DICOM Images


Here we'll use the Python package pydicom to view the downloaded DICOM images.

Note that some of the files may contain compressed pixel data that require other packages to view; so, for this demo we'll simply skip over those using the following loop.

In [17]:
for image_file in image_files:
    print(image_file)
    ds = pydicom.dcmread(image_file)
    try:
        new_image = ds.pixel_array.astype(float)
        scaled_image = (np.maximum(new_image, 0) / new_image.max()) * 255.0
        scaled_image = np.uint8(scaled_image)
        final_image = Image.fromarray(scaled_image)
        print(type(final_image))
        display(final_image)
    except Exception as e:
        print("Couldn't view {}: {}.".format(image_file,e))
downloads/419639-005966/1.2.826.0.1.3680043.10.474.419639.238877950213975218363744768567/1.2.826.0.1.3680043.10.474.419639.199185633225969837235422780961/1.2.826.0.1.3680043.10.474.419639.338357498067445660994795729410.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-005966/1.2.826.0.1.3680043.10.474.419639.136347983342456090026818822110/1.2.826.0.1.3680043.10.474.419639.116290334401168239019633666124/1.2.826.0.1.3680043.10.474.419639.120300845449893409248681861328.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.161896168092474395209530500374/1.2.826.0.1.3680043.10.474.419639.241350960080622520462420009922/1.2.826.0.1.3680043.10.474.419639.145110880890384391433274781976.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.118613874522385825655419409466/1.2.826.0.1.3680043.10.474.419639.305005715360473208980309632160/1.2.826.0.1.3680043.10.474.419639.532016163989758731316920279042.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.125810573309217143784890529197/1.2.826.0.1.3680043.10.474.419639.118065679248589286894262755470/1.2.826.0.1.3680043.10.474.419639.222096163603758063408202308357.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.125810573309217143784890529197/1.2.826.0.1.3680043.10.474.419639.332211068554181067467022145158/1.2.826.0.1.3680043.10.474.419639.203380996083922564313807962791.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.274623828911318186126754873274/1.2.826.0.1.3680043.10.474.419639.251758278872580821083226828769/1.2.826.0.1.3680043.10.474.419639.129831158929213234512420327031.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-003484/1.2.826.0.1.3680043.10.474.419639.140316142430659476069988221925/1.2.826.0.1.3680043.10.474.419639.235077257085678341201661347366/1.2.826.0.1.3680043.10.474.419639.250971161552980694447122795609.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/419639-003484/1.2.826.0.1.3680043.10.474.419639.210778054999760359188417916669/1.2.826.0.1.3680043.10.474.419639.485051506367402929274868149596/1.2.826.0.1.3680043.10.474.419639.130092207615417146244447563736.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
downloads/10008204-RwVMPdTu0EOZV6oE7Rml5Q/2.16.840.1.114274.1818.514395397152296418914049330214008864917/2.16.840.1.114274.1818.49815354666685421105401695275387637902/2.16.840.1.114274.1818.57232156540098663905951504146530613421.dcm
<class 'PIL.Image.Image'>
No description has been provided for this image
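The scaling step in the loop above divides by the maximum pixel value, which gives no usable result for a constant image and does not use the full 0-255 range when the minimum is above zero. The following is an alternative sketch, not the method used in this notebook: the hypothetical helper to_uint8 applies min-max scaling with a guard for constant images and an optional inversion, which may be wanted for images whose Photometric Interpretation is MONOCHROME1 (where the minimum value is displayed as white).

```python
import numpy as np


def to_uint8(pixels, invert=False):
    """Min-max scale a pixel array to the 0-255 range as uint8."""
    data = np.asarray(pixels, dtype=float)
    low, high = data.min(), data.max()
    if high == low:
        # Constant image: avoid dividing by zero and return all zeros
        return np.zeros(data.shape, dtype=np.uint8)
    scaled = (data - low) / (high - low) * 255.0
    if invert:
        scaled = 255.0 - scaled
    return scaled.astype(np.uint8)


print(to_uint8([[0, 50], [100, 200]]))
```

With a pydicom dataset ds, this could be used as to_uint8(ds.pixel_array, invert=(ds.get("PhotometricInterpretation") == "MONOCHROME1")) before passing the result to Image.fromarray.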

View the DICOM Headers


DICOM files have metadata elements embedded in the images. These can also be read and viewed using the pydicom package.
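Besides displaying the whole header, individual elements can be read by their DICOM keyword as attributes of the dataset. The sketch below collects a few of them into a dictionary; the helper name summarize_header and the choice of keywords are illustrative, and the stand-in object only mimics attribute access so that the example runs without a DICOM file.

```python
from types import SimpleNamespace

DEFAULT_KEYWORDS = (
    "Modality",
    "BodyPartExamined",
    "PatientSex",
    "PatientAge",
    "StudyDescription",
)


def summarize_header(ds, keywords=DEFAULT_KEYWORDS):
    """Collect selected header elements by keyword; missing ones map to None."""
    return {keyword: getattr(ds, keyword, None) for keyword in keywords}


# Stand-in for a dataset; with pydicom, pass the result of pydicom.dcmread(...)
fake_ds = SimpleNamespace(Modality="DX", PatientSex="M")
print(summarize_header(fake_ds))
```

With a real file, summarize_header(pydicom.dcmread(image_files[0])) returns the same kind of dictionary, which is convenient for building a table of header values across all downloaded images.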

In [18]:
ds = pydicom.dcmread(image_files[0],force=True)
display(ds)
Dataset.file_meta -------------------------------
(0002, 0000) File Meta Information Group Length  UL: 240
(0002, 0001) File Meta Information Version       OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID         UI: Digital X-Ray Image Storage - For Presentation
(0002, 0003) Media Storage SOP Instance UID      UI: 1.2.826.0.1.3680043.10.474.419639.338357498067445660994795729410
(0002, 0010) Transfer Syntax UID                 UI: Explicit VR Little Endian
(0002, 0012) Implementation Class UID            UI: 1.2.826.0.1.3680043.2.1143.107.104.103.115.3.1.0
(0002, 0013) Implementation Version Name         SH: 'GDCM 3.1.0'
(0002, 0016) Source Application Entity Title     AE: 'gdcmconv'
-------------------------------------------------
(0008, 0005) Specific Character Set              CS: 'ISO_IR 100'
(0008, 0008) Image Type                          CS: ['DERIVED', 'PRIMARY']
(0008, 0016) SOP Class UID                       UI: Digital X-Ray Image Storage - For Presentation
(0008, 0018) SOP Instance UID                    UI: 1.2.826.0.1.3680043.10.474.419639.338357498067445660994795729410
(0008, 0020) Study Date                          DA: '20110508'
(0008, 0021) Series Date                         DA: '20110508'
(0008, 0022) Acquisition Date                    DA: '20110508'
(0008, 0023) Content Date                        DA: '20110508'
(0008, 0030) Study Time                          TM: '203127.524391'
(0008, 0031) Series Time                         TM: '203133.445523'
(0008, 0032) Acquisition Time                    TM: '203133.445523'
(0008, 0033) Content Time                        TM: '203133.445523'
(0008, 0050) Accession Number                    SH: '24164'
(0008, 0060) Modality                            CS: 'DX'
(0008, 1030) Study Description                   LO: 'XR CHEST 1 VIEW AP'
(0008, 103e) Series Description                  LO: 'AP'
(0010, 0010) Patient's Name                      PN: '419639-005966'
(0010, 0020) Patient ID                          LO: '419639-005966'
(0010, 0040) Patient's Sex                       CS: 'M'
(0010, 1010) Patient's Age                       AS: '070Y'
(0012, 0063) De-identification Method            LO: 'RSNA Covid-19 Dataset Default'
(0012, 0064)  De-identification Method Code Sequence  4 item(s) ---- 
   (0008, 0100) Code Value                          SH: '113100'
   (0008, 0102) Coding Scheme Designator            SH: 'DCM'
   (0008, 0104) Code Meaning                        LO: 'Basic Application Confidentiality Profile'
   ---------
   (0008, 0100) Code Value                          SH: '113107'
   (0008, 0102) Coding Scheme Designator            SH: 'DCM'
   (0008, 0104) Code Meaning                        LO: 'Retain Longitudinal Temporal Information Modified Dates Option'
   ---------
   (0008, 0100) Code Value                          SH: '113108'
   (0008, 0102) Coding Scheme Designator            SH: 'DCM'
   (0008, 0104) Code Meaning                        LO: 'Retain Patient Characteristics Option'
   ---------
   (0008, 0100) Code Value                          SH: '113109'
   (0008, 0102) Coding Scheme Designator            SH: 'DCM'
   (0008, 0104) Code Meaning                        LO: 'Retain Device Identity Option'
   ---------
(0013, 0010) Private Creator                     LO: 'CTP'
(0013, 1010) Private tag data                    LO: 'MIDRC-RICORD'
(0013, 1011) Private tag data                    LO: 'MIDRC-RICORD'
(0013, 1012) Private tag data                    LO: '419639'
(0013, 1013) Private tag data                    LO: '419639'
(0018, 0010) Contrast/Bolus Agent                LO: ''
(0018, 0015) Body Part Examined                  CS: 'CHEST'
(0018, 0060) KVP                                 DS: '90.0'
(0018, 1020) Software Versions                   LO: '1.4'
(0018, 1110) Distance Source to Detector         DS: '0.0'
(0018, 1147) Field of View Shape                 CS: 'RECTANGLE'
(0018, 1149) Field of View Dimension(s)          IS: [127741, 142753]
(0018, 1150) Exposure Time                       IS: '6'
(0018, 1151) X-Ray Tube Current                  IS: '323'
(0018, 1152) Exposure                            IS: '2'
(0018, 1153) Exposure in uAs                     IS: '2200'
(0018, 1156) Rectification Type                  CS: 'CONST POTENTIAL'
(0018, 115e) Image and Fluoroscopy Area Dose Pro DS: '0.0'
(0018, 1164) Imager Pixel Spacing                DS: [0.139, 0.139]
(0018, 1190) Focal Spot(s)                       DS: '1.2'
(0018, 1191) Anode Target Material               CS: 'TUNGSTEN'
(0018, 1405) Relative X-Ray Exposure             IS: '236'
(0018, 1411) Exposure Index                      DS: '235.73'
(0018, 1412) Target Exposure Index               DS: '200.0'
(0018, 1413) Deviation Index                     DS: '0.71'
(0018, 1508) Positioner Type                     CS: ''
(0018, 1700) Collimator Shape                    CS: 'RECTANGULAR'
(0018, 1702) Collimator Left Vertical Edge       IS: '0'
(0018, 1704) Collimator Right Vertical Edge      IS: '1027'
(0018, 1706) Collimator Upper Horizontal Edge    IS: '0'
(0018, 1708) Collimator Lower Horizontal Edge    IS: '919'
(0018, 5101) View Position                       CS: 'AP'
(0018, 7001) Detector Temperature                DS: '28.0'
(0018, 7004) Detector Type                       CS: 'DIRECT'
(0018, 7005) Detector Configuration              CS: 'AREA'
(0018, 7006) Detector Description                LT: Array of 23 elements
(0018, 7008) Detector Mode                       LT: 'PatientMode'
(0018, 7010) Exposures on Detector Since Last Ca IS: '27'
(0018, 7011) Exposures on Detector Since Manufac IS: '0'
(0018, 7012) Detector Time Since Last Exposure   DS: '0.0'
(0018, 7014) Detector Active Time                DS: '200.0'
(0018, 7020) Detector Element Physical Size      DS: [0.139, 0.139]
(0018, 7022) Detector Element Spacing            DS: [0.139, 0.139]
(0018, 702a) Detector Manufacturer Name          LO: 'CARESTREAM HEALTH'
(0018, 702b) Detector Manufacturer's Model Name  LO: 'DRXPlus3543C'
(0018, 7060) Exposure Control Mode               CS: 'MANUAL'
(0018, 8150) Exposure Time in uS                 DS: '6800.0'
(0018, 8151) X-Ray Tube Current in uA            DS: '323529.4'
(0018, 9332) Exposure in mAs                     FD: 2.2
(0020, 000d) Study Instance UID                  UI: 1.2.826.0.1.3680043.10.474.419639.238877950213975218363744768567
(0020, 000e) Series Instance UID                 UI: 1.2.826.0.1.3680043.10.474.419639.199185633225969837235422780961
(0020, 0010) Study ID                            SH: ''
(0020, 0011) Series Number                       IS: '1'
(0020, 0012) Acquisition Number                  IS: '1'
(0020, 0013) Instance Number                     IS: '1'
(0020, 0020) Patient Orientation                 CS: ['L', 'F']
(0020, 0062) Image Laterality                    CS: 'U'
(0020, 1002) Images in Acquisition               IS: '1'
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 2518
(0028, 0011) Columns                             US: 2813
(0028, 0030) Pixel Spacing                       DS: [0.139, 0.139]
(0028, 0034) Pixel Aspect Ratio                  IS: [1, 1]
(0028, 0100) Bits Allocated                      US: 16
(0028, 0101) Bits Stored                         US: 12
(0028, 0102) High Bit                            US: 11
(0028, 0103) Pixel Representation                US: 0
(0028, 0106) Smallest Image Pixel Value          US: 0
(0028, 0107) Largest Image Pixel Value           US: 4095
(0028, 0301) Burned In Annotation                CS: 'NO'
(0028, 1040) Pixel Intensity Relationship        CS: 'LOG'
(0028, 1041) Pixel Intensity Relationship Sign   SS: -1
(0028, 1050) Window Center                       DS: '2048.0'
(0028, 1051) Window Width                        DS: '4096.0'
(0028, 1052) Rescale Intercept                   DS: '0.0'
(0028, 1053) Rescale Slope                       DS: '1.0'
(0028, 1054) Rescale Type                        LO: 'US'
(0028, 2110) Lossy Image Compression             CS: '00'
(7fe0, 0010) Pixel Data                          OW: Array of 14166268 elements
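Several values in the header printed above can be cross-checked against each other with simple arithmetic. The following is a minimal sketch, not part of pydicom: the helper names are hypothetical, and it assumes a single-frame image with uncompressed pixel data.

```python
# Hypothetical helpers for sanity-checking DICOM header values (illustrative only).

def expected_pixel_bytes(rows, columns, bits_allocated, samples_per_pixel=1):
    """Size in bytes of one uncompressed frame."""
    return rows * columns * samples_per_pixel * (bits_allocated // 8)


def exposure_mas(exposure_time_us, tube_current_ua):
    """Exposure in mAs from exposure time (microseconds) and tube current (microamperes)."""
    seconds = exposure_time_us / 1_000_000
    milliamperes = tube_current_ua / 1_000
    return seconds * milliamperes


# Values copied from the header printed above
print(expected_pixel_bytes(rows=2518, columns=2813, bits_allocated=16))  # 14166268
print(round(exposure_mas(6800.0, 323529.4), 3))                          # 2.2
```

The first result matches the length reported for Pixel Data (7fe0, 0010), and the second matches Exposure in mAs (0018, 9332).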
In [19]:
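# Access individual elements
display(ds.file_meta)        # the file meta information header (group 0002)
display(ds.ImageType)        # access by keyword
display(ds[0x0008, 0x0016])  # access by (group, element) tag: SOP Class UID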
(0002, 0000) File Meta Information Group Length  UL: 240
(0002, 0001) File Meta Information Version       OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID         UI: Digital X-Ray Image Storage - For Presentation
(0002, 0003) Media Storage SOP Instance UID      UI: 1.2.826.0.1.3680043.10.474.419639.338357498067445660994795729410
(0002, 0010) Transfer Syntax UID                 UI: Explicit VR Little Endian
(0002, 0012) Implementation Class UID            UI: 1.2.826.0.1.3680043.2.1143.107.104.103.115.3.1.0
(0002, 0013) Implementation Version Name         SH: 'GDCM 3.1.0'
(0002, 0016) Source Application Entity Title     AE: 'gdcmconv'
['DERIVED', 'PRIMARY']
(0008, 0016) SOP Class UID                       UI: Digital X-Ray Image Storage - For Presentation
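pydicom prints the SOP Class UID as a readable name, but the value stored in the file is the dotted UID string, which is what appears once the metadata is moved into a DataFrame. As a rough sketch, a small hand-written lookup can translate the UIDs back to names. The dictionary below is an assumption limited to the two SOP classes seen in this notebook's files, and `sop_class_name` is a hypothetical helper.

```python
# Hand-written lookup covering only the two SOP classes seen in this notebook's files.
SOP_CLASS_NAMES = {
    "1.2.840.10008.5.1.4.1.1.1": "Computed Radiography Image Storage",
    "1.2.840.10008.5.1.4.1.1.1.1": "Digital X-Ray Image Storage - For Presentation",
}


def sop_class_name(uid):
    """Return a readable name for a SOP Class UID, or the UID itself if it is not in the lookup."""
    return SOP_CLASS_NAMES.get(uid, uid)


print(sop_class_name("1.2.840.10008.5.1.4.1.1.1.1"))
print(sop_class_name("1.2.840.10008.5.1.4.1.1.1"))
```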
In [20]:
# View the DICOM metadata for all files as a DataFrame
dfs = []
for image_file in image_files:
    ds = pydicom.dcmread(image_file)
    df = pd.DataFrame(ds.values())  # one row per data element
    # Convert any lazily loaded raw elements into full DataElement objects
    df[0] = df[0].apply(lambda x: pydicom.dataelem.DataElement_from_raw(x) if isinstance(x, pydicom.dataelem.RawDataElement) else x)
    df['name'] = df[0].apply(lambda x: x.name)
    df['value'] = df[0].apply(lambda x: x.value)
    df = df[['name', 'value']]
    df = df.set_index('name').T.reset_index(drop=True)  # pivot to a single row with one column per element name
    df['filename'] = image_file
    df.drop(columns=['Pixel Data'], inplace=True, errors='ignore')  # drop the pixel data: too large to be useful in a DataFrame; ignore files without it
    dfs.append(df)
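Because the cell above keys each column on the element name, elements that share a name (for example, the several "Private tag data" entries in the header printed earlier) end up as columns with identical labels. One possible way to keep them distinct is sketched below using plain Python; `flatten_elements` is a hypothetical helper, and the `(tag, name, value)` triples are a simplified stand-in for pydicom data elements.

```python
from collections import Counter


def flatten_elements(elements, skip=("Pixel Data",)):
    """Turn (tag, name, value) triples into a {column: value} dict.

    Names that occur more than once get their tag appended so columns stay unique.
    """
    counts = Counter(name for _, name, _ in elements)
    row = {}
    for tag, name, value in elements:
        if name in skip:
            continue
        key = f"{name} {tag}" if counts[name] > 1 else name
        row[key] = value
    return row


# A few elements copied from the header printed earlier (pixel bytes replaced by a placeholder)
elements = [
    ("(0008, 0060)", "Modality", "DX"),
    ("(0013, 1010)", "Private tag data", "MIDRC-RICORD"),
    ("(0013, 1012)", "Private tag data", "419639"),
    ("(7fe0, 0010)", "Pixel Data", b"placeholder"),
]
row = flatten_elements(elements)
print(row)
```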
In [21]:
# Make a master DataFrame for all images, keeping only the headers present in every per-file DataFrame
headers = list(set.intersection(*map(set, dfs)))  # note: set ordering means the column order is not guaranteed
df = pd.concat([d[headers] for d in dfs])
df.set_index('filename', inplace=True)
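Taking a set intersection of the headers does not preserve any particular column order. If a reproducible order matters, one option is to keep the order of the first file's headers, as in this sketch. `common_headers` is a hypothetical helper that operates on plain lists of header names rather than DataFrames.

```python
def common_headers(header_lists):
    """Return the headers present in every list, in the order they appear in the first list."""
    if not header_lists:
        return []
    first, *others = header_lists
    shared = set(first).intersection(*others)
    # dict.fromkeys removes duplicates while keeping first-seen order
    return [h for h in dict.fromkeys(first) if h in shared]


# Illustrative header lists for two files
file_a = ["Modality", "Study Date", "Detector Type", "filename"]
file_b = ["filename", "Modality", "Study Date"]
print(common_headers([file_a, file_b]))  # ['Modality', 'Study Date', 'filename']
```

With DataFrames, the same function could be called as `common_headers([list(d.columns) for d in dfs])`.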
In [22]:
display(df)
name Exposure Index Exposure in uAs Instance Number Study Instance UID Content Time Series Time Study ID Modality Series Date X-Ray Tube Current ... Samples per Pixel High Bit Exposure Specific Character Set Accession Number De-identification Method Code Sequence Body Part Examined Pixel Representation SOP Class UID SOP Instance UID
filename
downloads/419639-005966/1.2.826.0.1.3680043.10.474.419639.238877950213975218363744768567/1.2.826.0.1.3680043.10.474.419639.199185633225969837235422780961/1.2.826.0.1.3680043.10.474.419639.338357498067445660994795729410.dcm 235.73 2200 1 1.2.826.0.1.3680043.10.474.419639.238877950213... 203133.445523 203133.445523 DX 20110508 323 ... 1 11 2 ISO_IR 100 24164 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.338357498067...
downloads/419639-005966/1.2.826.0.1.3680043.10.474.419639.136347983342456090026818822110/1.2.826.0.1.3680043.10.474.419639.116290334401168239019633666124/1.2.826.0.1.3680043.10.474.419639.120300845449893409248681861328.dcm 153.86 2000 1 1.2.826.0.1.3680043.10.474.419639.136347983342... 115907.728007 115907.728007 CR 20110510 322 ... 1 11 2 ISO_IR 100 24178 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1 1.2.826.0.1.3680043.10.474.419639.120300845449...
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.161896168092474395209530500374/1.2.826.0.1.3680043.10.474.419639.241350960080622520462420009922/1.2.826.0.1.3680043.10.474.419639.145110880890384391433274781976.dcm 200.0 1280 1 1.2.826.0.1.3680043.10.474.419639.161896168092... 171609.043 171056.567 DX 20170528 320 ... 1 11 1 ISO_IR 100 19536 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.145110880890...
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.118613874522385825655419409466/1.2.826.0.1.3680043.10.474.419639.305005715360473208980309632160/1.2.826.0.1.3680043.10.474.419639.532016163989758731316920279042.dcm 182.38 1600 1 1.2.826.0.1.3680043.10.474.419639.118613874522... 102738.306306 102738.306306 DX 20170605 320 ... 1 11 1 ISO_IR 100 20186 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.532016163989...
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.125810573309217143784890529197/1.2.826.0.1.3680043.10.474.419639.118065679248589286894262755470/1.2.826.0.1.3680043.10.474.419639.222096163603758063408202308357.dcm 245.05 1500 1 1.2.826.0.1.3680043.10.474.419639.125810573309... 090416.195916 090416.195916 DX 20170529 312 ... 1 11 1 ISO_IR 100 19530 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.222096163603...
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.125810573309217143784890529197/1.2.826.0.1.3680043.10.474.419639.332211068554181067467022145158/1.2.826.0.1.3680043.10.474.419639.203380996083922564313807962791.dcm 245.05 1500 1 1.2.826.0.1.3680043.10.474.419639.125810573309... 090904.610042 090904.610042 DX 20170529 312 ... 1 11 1 ISO_IR 100 19530 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.203380996083...
downloads/419639-004486/1.2.826.0.1.3680043.10.474.419639.274623828911318186126754873274/1.2.826.0.1.3680043.10.474.419639.251758278872580821083226828769/1.2.826.0.1.3680043.10.474.419639.129831158929213234512420327031.dcm 120.17 2000 1 1.2.826.0.1.3680043.10.474.419639.274623828911... 155737.445712 155737.445712 DX 20170503 250 ... 1 11 2 ISO_IR 100 16795 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.129831158929...
downloads/419639-003484/1.2.826.0.1.3680043.10.474.419639.140316142430659476069988221925/1.2.826.0.1.3680043.10.474.419639.235077257085678341201661347366/1.2.826.0.1.3680043.10.474.419639.250971161552980694447122795609.dcm 293.0 1920 1 1.2.826.0.1.3680043.10.474.419639.140316142430... 020224.787 015457.200 DX 20111013 320 ... 1 11 2 ISO_IR 100 12234 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.250971161552...
downloads/419639-003484/1.2.826.0.1.3680043.10.474.419639.210778054999760359188417916669/1.2.826.0.1.3680043.10.474.419639.485051506367402929274868149596/1.2.826.0.1.3680043.10.474.419639.130092207615417146244447563736.dcm 85.87 1600 1 1.2.826.0.1.3680043.10.474.419639.210778054999... 163515.54252 163515.54252 DX 20111013 320 ... 1 11 1 ISO_IR 100 12239 [[(0008, 0100) Code Value ... CHEST 0 1.2.840.10008.5.1.4.1.1.1.1 1.2.826.0.1.3680043.10.474.419639.130092207615...
downloads/10008204-RwVMPdTu0EOZV6oE7Rml5Q/2.16.840.1.114274.1818.514395397152296418914049330214008864917/2.16.840.1.114274.1818.49815354666685421105401695275387637902/2.16.840.1.114274.1818.57232156540098663905951504146530613421.dcm 281.28 2800 1 2.16.840.1.114274.1818.51439539715229641891404... 101126 101126.158 CR 19600101 320 ... 1 11 3 [[(0008, 0100) Code Value ... PORT CHEST 0 1.2.840.10008.5.1.4.1.1.1 2.16.840.1.114274.1818.57232156540098663905951...

10 rows × 53 columns
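The date and time columns in this table are stored as DICOM strings: DA values look like `YYYYMMDD`, and TM values look like `HHMMSS` with an optional fractional part of varying length. A minimal sketch for combining them into a Python `datetime` follows. `parse_dicom_datetime` is a hypothetical helper, and it assumes both values are present and well formed. Keep in mind that the header lists a modified-dates de-identification option, so the parsed values are probably better suited to computing intervals than to reading as true calendar dates.

```python
from datetime import datetime


def parse_dicom_datetime(da, tm):
    """Combine a DICOM DA (YYYYMMDD) and TM (HHMMSS[.FFFFFF]) string into a datetime."""
    da = da.strip()
    whole, _, frac = tm.strip().partition(".")
    whole = whole.ljust(6, "0")            # TM may omit minutes and/or seconds
    micro = int(frac.ljust(6, "0")[:6])    # pad the fraction out to microseconds
    return datetime(
        int(da[0:4]), int(da[4:6]), int(da[6:8]),
        int(whole[0:2]), int(whole[2:4]), int(whole[4:6]),
        micro,
    )


# Date/time pairs copied from the table above
print(parse_dicom_datetime("20110508", "203133.445523"))
print(parse_dicom_datetime("20170528", "171609.043"))
print(parse_dicom_datetime("19600101", "101126"))
```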

In [24]:
## Export the file metadata as a TSV file
filename = "MIDRC_DICOM_metadata.tsv"
df.to_csv(filename, sep='\t')
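To check the export, the TSV can be read back with Python's standard `csv` module. The sketch below is illustrative: `read_tsv` is a hypothetical helper, and the in-memory sample with placeholder filenames stands in for the real `MIDRC_DICOM_metadata.tsv`.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated file into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# In-memory stand-in for the exported file; filenames are placeholders
sample = io.StringIO(
    "filename\tModality\tBody Part Examined\n"
    "a.dcm\tDX\tCHEST\n"
    "b.dcm\tCR\tPORT CHEST\n"
)
rows = read_tsv(sample)
print(len(rows), rows[1]["Body Part Examined"])

# With the real export:
# with open("MIDRC_DICOM_metadata.tsv", newline="") as f:
#     rows = read_tsv(f)
```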

The End¶


If you have any questions about this notebook, don't hesitate to reach out to the MIDRC Helpdesk at midrc-support@datacommons.io or to the author directly at cgmeyer@uchicago.edu.

Happy data wrangling!
